20 research outputs found

    Analyse Markovienne des Stratégies d'Evolution

    In this dissertation, Evolution Strategies (ESs) are analyzed using the theory of Markov chains. Proofs of divergence or convergence of these algorithms are obtained, and tools to achieve such proofs are developed. ESs are so-called "black-box" stochastic optimization algorithms, i.e. the information on the function to be optimized is limited to the values it associates with points. In particular, gradients are unavailable. Proofs of convergence or divergence of these algorithms can be obtained through the analysis of the Markov chains underlying them. The proofs of log-linear convergence and of divergence obtained in this thesis for a linear function with or without a constraint are essential components of proofs of convergence of ESs on wide classes of functions. This dissertation first gives an introduction to Markov chain theory, then a state of the art on ESs and on black-box continuous optimization, and presents already established links between ESs and Markov chains. The contributions of this thesis are then presented. First, general mathematical tools that can be applied to a wider range of problems are developed. These tools make it easy to prove specific properties (irreducibility, aperiodicity and the fact that compact sets are small sets for the Markov chain) of the Markov chains studied. Obtaining these properties without these tools is an ad hoc, tedious and technical process that can be very difficult. Second, different ESs are analyzed on different problems. We study a $(1,\lambda)$-ES using cumulative step-size adaptation on a linear function and prove the log-linear divergence of the step-size; we also study the variation of the logarithm of the step-size, from which we establish a necessary condition for the stability of the algorithm with respect to the dimension of the search space.
Then we study an ES with constant step-size and with cumulative step-size adaptation on a linear function with a linear constraint, using resampling to handle infeasible solutions. We prove that with constant step-size the algorithm diverges, while with cumulative step-size adaptation, depending on parameters of the problem and of the ES, the algorithm converges or diverges log-linearly. We then investigate how the convergence or divergence rate of the algorithm depends on parameters of the problem and of the ES. Finally, we study an ES with a sampling distribution that can be non-Gaussian and with constant step-size on a linear function with a linear constraint. We give sufficient conditions on the sampling distribution for the algorithm to diverge. We also show that different covariance matrices for the sampling distribution correspond to a change of norm of the search space, and that this implies that adapting the covariance matrix of the sampling distribution may allow an ES with cumulative step-size adaptation to successfully diverge on a linear function with any linear constraint. Finally, these results are summarized and discussed, and perspectives for future work are explored.
This thesis contains proofs of convergence or divergence of optimization algorithms called evolution strategies (ESs), together with the development of mathematical tools that make these proofs possible. ESs are so-called "black-box" stochastic optimization algorithms, i.e. the information on the optimized function is reduced to the values it associates with points. In particular, the gradient of the function is unknown. Proofs of convergence or divergence of these algorithms can be obtained by analyzing the Markov chains underlying them. The proofs of convergence and divergence obtained in this thesis establish the asymptotic behaviour of ESs when optimizing a linear function with or without a constraint, which is a key case for proofs of convergence of ESs on wide classes of functions. The thesis first presents an introduction to Markov chains, then a state of the art on ESs and their place among black-box continuous optimization algorithms, as well as the established links between ESs and Markov chains. The contributions of the thesis are then presented. First, general mathematical tools applicable to other problems are developed. These tools make it easy to establish certain properties (namely irreducibility, aperiodicity and the fact that compact sets are small sets for the Markov chain) of the Markov chains studied. Without these tools, establishing these properties was an ad hoc and technical process that could prove very difficult. Second, different ESs are analyzed on different problems. A $(1,\lambda)$-ES using cumulative step-size adaptation is studied on a linear function. It is shown that for $\lambda > 2$ the algorithm diverges log-linearly, successfully optimizing the function. The divergence rate is given explicitly, which can be used to compute an optimal value of $\lambda$ for the linear function. In addition, the variance of the step-size is computed, from which a condition is deduced on how the cumulation parameter should be adapted to the problem dimension in order to keep the algorithm stable. Next, a $(1,\lambda)$-ES with constant step-size and a $(1,\lambda)$-ES with cumulative step-size adaptation are studied on a linear function with a linear constraint. With a constant step-size, the algorithm solves the problem by diverging slowly. Under a few simple conditions, this result also holds when the algorithm uses non-Gaussian distributions to generate new solutions. When the step-size is adapted with cumulative step-size adaptation, the success of the algorithm depends on the angle between the gradients of the constraint and of the optimized function. If this angle is too small, the algorithm converges prematurely; otherwise, it diverges log-linearly. Finally, the results are summarized and discussed, and perspectives for future work are presented.

    Markov Chain Analysis of Evolution Strategies on a Linear Constraint Optimization Problem

    This paper analyses a $(1,\lambda)$-Evolution Strategy, a randomised comparison-based adaptive search algorithm, on a simple constrained optimisation problem. The algorithm uses resampling to handle the constraint and optimizes a linear function with a linear constraint. Two cases are investigated: first the case where the step-size is constant, and second the case where the step-size is adapted using path length control. We exhibit for each case a Markov chain whose stability analysis would allow us to deduce the divergence of the algorithm depending on its internal parameters. We show divergence at a constant rate when the step-size is constant. We sketch that with step-size adaptation geometric divergence takes place. Our results complement previous studies where stability was assumed. Comment: Amir Hussain; Zhigang Zeng; Nian Zhang. IEEE Congress on Evolutionary Computation, Jul 2014, Beijing, China
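
The constant step-size case described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the objective is assumed to be maximization of the first coordinate `x[0]`, the constraint is assumed to be `x[0]*cos(theta) + x[1]*sin(theta) <= 0`, and the names `feasible` and `es_resampling` as well as all default parameter values are hypothetical.

```python
import math
import random


def feasible(y, theta):
    """Assumed constraint: y[0]*cos(theta) + y[1]*sin(theta) <= 0."""
    return y[0] * math.cos(theta) + y[1] * math.sin(theta) <= 0.0


def es_resampling(n=2, lam=5, sigma=1.0, theta=math.pi / 4, iters=200, seed=1):
    """(1,lambda)-ES with constant step-size; infeasible samples are redrawn."""
    rng = random.Random(seed)
    x = [-1.0] + [0.0] * (n - 1)  # feasible start for theta in (0, pi/2)
    history = [x[0]]              # objective value (assumed: x[0]) per iteration
    for _ in range(iters):
        offspring = []
        for _ in range(lam):
            while True:  # resampling: draw again until the sample is feasible
                y = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
                if feasible(y, theta):
                    break
            offspring.append(y)
        # Comma selection: the best offspring replaces the parent unconditionally.
        x = max(offspring, key=lambda y: y[0])
        history.append(x[0])
    return x, history


if __name__ == "__main__":
    x, history = es_resampling()
    print("objective at start:", history[0], "objective at end:", history[-1])
```

Because the parent is always feasible, each candidate is feasible with probability at least one half, so the resampling loop terminates quickly. Plotting `history` against the iteration index is one way to look at the constant-rate growth the abstract refers to.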

    Markov Chain Analysis of Cumulative Step-size Adaptation on a Linear Constrained Problem

    This paper analyzes a $(1,\lambda)$-Evolution Strategy, a randomized comparison-based adaptive search algorithm, optimizing a linear function with a linear constraint. The algorithm uses resampling to handle the constraint. Two cases are investigated: first the case where the step-size is constant, and second the case where the step-size is adapted using cumulative step-size adaptation. We exhibit for each case a Markov chain describing the behaviour of the algorithm. Stability of the chain implies, by applying a law of large numbers, either convergence or divergence of the algorithm. Divergence is the desired behaviour. In the constant step-size case, we show stability of the Markov chain and prove the divergence of the algorithm. In the cumulative step-size adaptation case, we prove stability of the Markov chain in the simplified case where the cumulation parameter equals 1, and discuss steps to obtain similar results for the full (default) algorithm where the cumulation parameter is smaller than 1. The stability of the Markov chain allows us to deduce geometric divergence or convergence, depending on the dimension, constraint angle, population size and damping parameter, at a rate that we estimate. Our results complement previous studies where stability was assumed. Comment: Evolutionary Computation, Massachusetts Institute of Technology Press (MIT Press): STM Titles, 201

    A Generalized Markov-Chain Modelling Approach to $(1,\lambda)$-ES Linear Optimization: Technical Report

    Several recent publications investigated Markov-chain modelling of linear optimization by a $(1,\lambda)$-ES, considering both unconstrained and linearly constrained optimization, and both constant and varying step size. All of them assume normality of the involved random steps, and while this is consistent with a black-box scenario, information on the function to be optimized (e.g. separability) may be exploited by the use of another distribution. The objective of our contribution is to complement previous studies realized with normal steps, and to give sufficient conditions on the distribution of the random steps for the success of a constant step-size $(1,\lambda)$-ES on the simple problem of a linear function with a linear constraint. The decomposition of a multidimensional distribution into its marginals and the copula combining them is applied to the new distributional assumptions, particular attention being paid to distributions with Archimedean copulas.

    Cumulative Step-size Adaptation on Linear Functions: Technical Report

    The CSA-ES is an Evolution Strategy with Cumulative Step-size Adaptation, where the step size is adapted by measuring the length of a so-called cumulative path. The cumulative path is a combination of the previous steps realized by the algorithm, where the importance of each step decreases with time. This article studies the CSA-ES on composites of strictly increasing functions with affine linear functions through the investigation of its underlying Markov chains. Rigorous results on the change and the variation of the step size are derived with and without cumulation. The step-size diverges geometrically fast in most cases. Furthermore, the influence of the cumulation parameter is studied.
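
The cumulative path and the step-size update described above can be simulated directly. The sketch below is an illustration under assumptions and is not taken from the report: it uses one common formulation of cumulative step-size adaptation (path update with factor `sqrt(c*(2-c))` and a step-size update based on the squared path length), it assumes the linear function to be maximized is the first coordinate, and the function name `csa_log_step_size` and the values of `c` and `d_sigma` are hypothetical.

```python
import math
import random


def csa_log_step_size(n=5, lam=5, c=0.5, d_sigma=1.0, iters=1000, seed=1):
    """Trace of log(sigma) for a (1,lambda)-CSA-ES on a linear function.

    On a linear function the ranking of the offspring depends only on the
    first coordinate of the sampled steps, so neither the current point nor
    sigma itself is needed to simulate the step-size dynamics.
    """
    rng = random.Random(seed)
    path = [0.0] * n      # cumulative path: older steps fade by (1 - c) per update
    log_sigma = 0.0
    trace = [log_sigma]
    for _ in range(iters):
        steps = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(lam)]
        best = max(steps, key=lambda z: z[0])  # selected step (assumed maximization)
        path = [(1.0 - c) * p + math.sqrt(c * (2.0 - c)) * z
                for p, z in zip(path, best)]
        sq_norm = sum(p * p for p in path)
        # Assumed update rule: grow sigma when the path is longer than expected
        # under random selection (E||p||^2 = n), shrink it otherwise.
        log_sigma += c / (2.0 * d_sigma) * (sq_norm / n - 1.0)
        trace.append(log_sigma)
    return trace


if __name__ == "__main__":
    trace = csa_log_step_size()
    print("average change of log(sigma) per iteration:", trace[-1] / (len(trace) - 1))
```

Setting `c=1.0` removes the cumulation, which gives a simple way to compare the two regimes the abstract mentions.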

    Verifiable Conditions for the Irreducibility and Aperiodicity of Markov Chains by Analyzing Underlying Deterministic Models

    We consider Markov chains that obey the following general non-linear state space model: $\Phi_{k+1} = F(\Phi_k, \alpha(\Phi_k, U_{k+1}))$ where the function $F$ is $C^1$ while $\alpha$ is typically discontinuous and $\{U_k: k \in \mathbb{Z}_{>0}\}$ is an independent and identically distributed process. We assume that for all $x$, the random variable $\alpha(x, U_1)$ admits a density $p_x$ such that $(x, w) \mapsto p_x(w)$ is lower semi-continuous. We generalize and extend previous results that connect properties of the underlying deterministic control model to provide conditions for the chain to be $\varphi$-irreducible and aperiodic. By building on those results, we show that if a rank condition on the controllability matrix is satisfied for all $x$, there is equivalence between the existence of a globally attracting state for the control model and $\varphi$-irreducibility of the Markov chain. Additionally, under the same rank condition on the controllability matrix, we prove that there is equivalence between the existence of a steadily attracting state and the $\varphi$-irreducibility and aperiodicity of the chain. The notion of steadily attracting state is new. We additionally derive practical conditions by showing that the rank condition on the controllability matrix needs to be verified only at a globally attracting state (resp. steadily attracting state) for the chain to be a $\varphi$-irreducible T-chain (resp. $\varphi$-irreducible aperiodic T-chain). Those results hold under considerably weaker assumptions on the model than previous ones that would require $(x,u) \mapsto F(x,\alpha(x,u))$ to be $C^\infty$ (while it can be discontinuous here).
Additionally, the establishment of a necessary and sufficient condition for the $\varphi$-irreducibility and aperiodicity without a structural assumption on the control set is novel, even for Markov chains where $(x,u) \mapsto F(x,\alpha(x,u))$ is $C^\infty$. We illustrate that the conditions are easy to verify on a non-trivial and non-artificial example of a Markov chain arising in the context of adaptive stochastic search algorithms to optimize continuous functions in a black-box scenario.
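
To make the shape of the model concrete, here is one possible instance, given as an assumed illustration and not as the example used in the paper: the cumulative path of a $(1,\lambda)$-ES with cumulative step-size adaptation that maximizes the first coordinate. The cumulation parameter $c \in (0,1]$ and the symbols $p_k$, $N^i$ and $i^\star$ are notation introduced here.

```latex
\begin{align*}
\Phi_k &= p_k \in \mathbb{R}^n, \qquad
U_{k+1} = \left(N^1_{k+1}, \dots, N^\lambda_{k+1}\right), \quad
N^i_{k+1} \sim \mathcal{N}(0, I_n) \ \text{i.i.d.}, \\
\alpha(p, U) &= N^{i^\star}, \qquad
i^\star = \operatorname*{arg\,max}_{1 \le i \le \lambda} \, [N^i]_1, \\
F(p, w) &= (1 - c)\, p + \sqrt{c\,(2 - c)}\; w .
\end{align*}
```

In this instance $F$ is smooth, while $\alpha$ selects one of the $\lambda$ sampled vectors and is therefore discontinuous in $U$, which matches the setting of the abstract. Here $\alpha$ happens not to depend on the state $p$.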